113 research outputs found

MegDet: A Large Mini-Batch Object Detector

Authors: Jia Kai; Jiang Yuning; Li Zeming; Peng Chao; Sun Jian; Xiao Tete; Yu Gang; Zhang Xiangyu
Publication date: 11/04/2018

The improvements in recent CNN-based object detection works, from R-CNN [11] and Fast/Faster R-CNN [10, 31] to the more recent Mask R-CNN [14] and RetinaNet [24], mainly come from new networks, new frameworks, or novel loss designs. But mini-batch size, a key factor in training, has not been well studied. In this paper, we propose a Large Mini-Batch Object Detector (MegDet) to enable training with a much larger mini-batch size than before (e.g., from 16 to 256), so that we can effectively utilize multiple GPUs (up to 128 in our experiments) to significantly shorten the training time. Technically, we suggest a learning rate policy and Cross-GPU Batch Normalization, which together allow us to successfully train a large mini-batch detector in much less time (e.g., from 33 hours to 4 hours) and achieve even better accuracy. MegDet is the backbone of our submission (mmAP 52.5%) to the COCO 2017 Challenge, where we won 1st place in the Detection task.

arXiv.org e-Print Archive

Crossref

Mitigating Label Biases for In-context Learning

Authors: Bosselut Antoine; Chen Zeming; Fei Yu; Hou Yifan
Publication date: 28/05/2023

Various design settings for in-context learning (ICL), such as the choice and order of the in-context examples, can bias the model's predictions. While many studies discuss these design choices, there have been few systematic investigations into categorizing them and mitigating their impact. In this work, we define a typology for three types of label biases in ICL for text classification: vanilla-label bias, context-label bias, and domain-label bias (which we conceptualize and detect for the first time). Our analysis demonstrates that prior label bias calibration methods fall short of addressing all three types of biases. Specifically, domain-label bias restricts LLMs to random-level performance on many tasks regardless of the choice of in-context examples. To mitigate the effect of these biases, we propose a simple bias calibration method that estimates a language model's label bias using random in-domain words from the task corpus. After controlling for this estimated bias when making predictions, our novel domain-context calibration significantly improves the ICL performance of GPT-J and GPT-3 on a wide range of tasks. The gain is substantial on tasks with large domain-label bias (up to 37% in Macro-F1). Furthermore, our results generalize to models with different scales, pretraining methods, and manually-designed task instructions, showing the prevalence of label biases in ICL.
Comment: Accepted to ACL 202
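The calibration idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample count, the random-text length, and the choice to control for the bias by dividing and renormalizing are all assumptions made for illustration. The abstract only states that the bias is estimated from random in-domain words and then controlled for at prediction time. `label_probs_fn` is a hypothetical stand-in for a language model call that returns label probabilities for an input text.

```python
import random
from typing import Callable, Dict, Sequence

LabelProbs = Dict[str, float]


def sample_random_text(corpus_words: Sequence[str], length: int,
                       rng: random.Random) -> str:
    """Build a content-free input from random in-domain words."""
    return " ".join(rng.choice(corpus_words) for _ in range(length))


def estimate_label_bias(label_probs_fn: Callable[[str], LabelProbs],
                        corpus_words: Sequence[str],
                        labels: Sequence[str],
                        num_samples: int = 20,   # assumed value
                        length: int = 8,         # assumed value
                        seed: int = 0) -> LabelProbs:
    """Average the model's label probabilities over random-word inputs."""
    rng = random.Random(seed)
    totals = {label: 0.0 for label in labels}
    for _ in range(num_samples):
        probs = label_probs_fn(sample_random_text(corpus_words, length, rng))
        for label in labels:
            totals[label] += probs[label]
    return {label: totals[label] / num_samples for label in labels}


def calibrate(probs: LabelProbs, bias: LabelProbs,
              eps: float = 1e-12) -> LabelProbs:
    """Divide by the estimated bias and renormalize (assumed form)."""
    scores = {label: probs[label] / max(bias[label], eps) for label in probs}
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def toy_model(text: str) -> LabelProbs:
    """Hypothetical biased classifier: leans 'positive' on any input."""
    if "awful" in text:
        return {"positive": 0.6, "negative": 0.4}
    return {"positive": 0.8, "negative": 0.2}


if __name__ == "__main__":
    words = ["movie", "plot", "actor", "scene"]
    bias = estimate_label_bias(toy_model, words, ["positive", "negative"])
    print(calibrate(toy_model("an awful movie"), bias))
```

In this toy setup the uncalibrated model still prefers "positive" for the negative review, while the calibrated scores prefer "negative", because the model's constant lean toward "positive" is divided out.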

arXiv.org e-Print Archive

Study of Energetic Particle Induced Biological Effect through FTIR and Raman Micro-Spectroscopy

Authors: Fang Yusheng; Huang Qing; Ke Zhigang; Liu Jinghua; Qi Zeming; Wei Xiaoli; Yu Zengliang
Publication venue: Biophysical Society. Published by Elsevier Inc.
Publication date: 31/01/2012

Elsevier - Publisher Connector